Synthetic-to-Real Pose Estimation with Geometric Reconstruction
Lin, Qiuxia, Gu, Kerui, Yang, Linlin, Yao, Angela
Pose estimation is remarkably successful under supervised learning, but obtaining annotations, especially for new deployments, is costly and time-consuming. This work tackles adapting models trained on synthetic data to real-world target domains with only unlabelled data. A common approach is model fine-tuning with pseudo-labels from the target domain; yet many pseudo-labelling strategies cannot provide sufficient high-quality pose labels. This work proposes a reconstruction-based strategy as a complement to pseudo-labelling for synthetic-to-real domain adaptation. We generate a driving image by geometrically transforming a base image according to the predicted keypoints and enforce a reconstruction loss to refine the predictions. This provides a novel way to correct confident yet inaccurate keypoint locations through image reconstruction during domain adaptation. Our approach outperforms the previous state of the art by 8% PCK on four large-scale hand and human real-world datasets. In particular, we excel on endpoints such as fingertips and the head, with 7.2% and 29.9% PCK improvements.
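The reconstruction-based correction described above can be illustrated with a deliberately tiny, self-contained sketch: the "image" is a Gaussian blob rendered at a keypoint, the driving image is the blob rendered at the predicted location, and a greedy search over the reconstruction loss pulls a confident-but-wrong prediction back to the true location. The rendering model and all names here are hypothetical simplifications, not the paper's pipeline.

```python
import numpy as np

H = W = 16

def render(kp):
    """Toy 'image': a Gaussian blob centred at the keypoint."""
    ys, xs = np.mgrid[0:H, 0:W]
    return np.exp(-((ys - kp[0]) ** 2 + (xs - kp[1]) ** 2) / 4.0)

def recon_loss(pred_kp, target_img):
    """Drive the base blob to the predicted location, compare with target."""
    return float(((render(pred_kp) - target_img) ** 2).mean())

def refine(pred_kp, target_img, steps=20):
    """Greedy local search: nudge the keypoint to lower the loss."""
    kp = np.asarray(pred_kp, float)
    for _ in range(steps):
        cands = [kp + d for d in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        best = min(cands, key=lambda c: recon_loss(c, target_img))
        if recon_loss(best, target_img) >= recon_loss(kp, target_img):
            break
        kp = best
    return kp

true_kp = np.array([8.0, 5.0])
target = render(true_kp)               # the "real" target frame
refined = refine([11.0, 9.0], target)  # confident but wrong prediction
```

In this toy setting the refined keypoint lands back on the true location because the reconstruction loss decreases monotonically as the blobs move closer; the paper's contribution is making this correction signal work with real images and learned warps.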
Geometrically Constrained and Token-Based Probabilistic Spatial Transformers
Schmidt, Johann, Stober, Sebastian
Fine-grained visual classification (FGVC) remains highly sensitive to geometric variability, where objects appear under arbitrary orientations, scales, and perspective distortions. While equivariant architectures address this issue, they typically require substantial computational resources and restrict the hypothesis space. We revisit Spatial Transformer Networks (STNs) as a canonicalization tool for transformer-based vision pipelines, emphasizing their flexibility, backbone-agnostic nature, and lack of architectural constraints. We propose a probabilistic, component-wise extension that improves robustness. Specifically, we decompose affine transformations into rotation, scaling, and shearing, and regress each component under geometric constraints using a shared localization encoder. To capture uncertainty, we model each component with a Gaussian variational posterior and perform sampling-based canonicalization during inference. A novel component-wise alignment loss leverages augmentation parameters to guide spatial alignment. Experiments on challenging moth classification benchmarks demonstrate that our method consistently improves robustness compared to other STNs.
- South America > Ecuador (0.04)
- Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
- Europe > France (0.04)
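The component-wise probabilistic parameterisation from the abstract above can be sketched as follows: an affine map is composed from separate rotation, scale, and shear factors, and each component is drawn from a Gaussian posterior via the reparameterisation trick. The posterior parameters below are made-up stand-ins for what a shared localisation encoder would regress.

```python
import numpy as np

rng = np.random.default_rng(0)

def affine_from_components(theta, s, shear):
    """Compose a 2x2 affine map from rotation angle, isotropic scale,
    and horizontal shear (the constrained decomposition)."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([s, s])
    Sh = np.array([[1.0, shear],
                   [0.0, 1.0]])
    return R @ S @ Sh

def sample_components(mu, log_sigma):
    """Reparameterised draw from the per-component Gaussian posterior."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

# hypothetical posterior parameters a localisation encoder might regress
mu = np.array([np.pi / 6, 1.2, 0.1])   # rotation, scale, shear means
log_sigma = np.full(3, -3.0)           # small, confident uncertainty

# sampling-based canonicalisation: average the transform over a few draws
draws = [affine_from_components(*sample_components(mu, log_sigma))
         for _ in range(8)]
A_mean = np.mean(draws, axis=0)
```

Regressing each factor separately is what allows geometric constraints (e.g. bounds on scale or shear) to be imposed per component rather than on the raw 2x3 affine matrix.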
MoNetV2: Enhanced Motion Network for Freehand 3D Ultrasound Reconstruction
Luo, Mingyuan, Yang, Xin, Yan, Zhongnuo, Cao, Yan, Zhang, Yuanji, Hu, Xindi, Wang, Jin, Ding, Haoxuan, Han, Wei, Sun, Litao, Ni, Dong
Abstract--Three-dimensional (3D) ultrasound (US) aims to provide sonographers with the spatial relationships of anatomical structures, playing a crucial role in clinical diagnosis. Recently, deep-learning-based freehand 3D US has made significant advancements. However, image-only reconstruction poses difficulties in reducing cumulative drift and further improving reconstruction accuracy, particularly in scenarios involving complex motion trajectories. In this context, we propose an enhanced motion network (MoNetV2) to enhance the accuracy and generalizability of reconstruction under diverse scanning velocities and tactics. First, we propose a sensor-based temporal and multi-branch structure that fuses image and motion information from a velocity perspective to improve image-only reconstruction accuracy. Second, we devise an online multi-level consistency constraint that exploits the inherent consistency of scans to handle various scanning velocities and tactics. This constraint exploits scan-level velocity consistency, path-level appearance consistency, and patch-level motion consistency to supervise inter-frame transformation estimation. Third, we distill an online multi-modal self-supervised strategy that leverages the correlation between network estimation and motion information to further reduce cumulative errors. Extensive experiments clearly demonstrate that MoNetV2 surpasses existing methods in both reconstruction quality and generalizability performance across three large datasets.
Ultrasound (US) imaging plays an important role in clinical monitoring and diagnosis because of its non-invasiveness, real-time capability, and mobility [1]. Its applications span various fields such as heart [2], fetus [3], breast [4], and liver [5]. Traditional 3D US imaging methods encompass mechanical, phased-array, and freehand techniques. Mechanical and phased-array imaging often suffer from specialized and expensive hardware with a limited field of view.
This work was supported by the National Natural Science Foundation of China (Nos. ). Jin Wang and Litao Sun are with the Cancer Center, Department of Ultrasound Medicine, Zhejiang Provincial People's Hospital, Affiliated People's Hospital of Hangzhou Medical College, Hangzhou, Zhejiang, China. Wei Han is with the Department of Health Management Center, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong, China.
- Asia > China > Zhejiang Province > Hangzhou (0.44)
- Asia > China > Shandong Province (0.24)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- (3 more...)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area (0.67)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
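As a toy illustration of the scan-level velocity consistency mentioned in the MoNetV2 abstract above, one can penalise disagreement between the mean speed implied by predicted inter-frame translations and a sensor-reported scanning velocity. The function and numbers below are hypothetical illustrations, not the paper's actual constraint.

```python
import numpy as np

def velocity_consistency(pred_translations, sensor_velocity, dt):
    """Scan-level velocity consistency: the mean speed implied by predicted
    inter-frame translations should match the sensor-reported velocity."""
    speeds = np.linalg.norm(pred_translations, axis=1) / dt
    return float((speeds.mean() - sensor_velocity) ** 2)

# toy check: 0.5 mm per frame at 30 fps corresponds to 15 mm/s
pred = np.tile([0.5, 0.0, 0.0], (10, 1))
loss_match = velocity_consistency(pred, sensor_velocity=15.0, dt=1 / 30)
loss_off = velocity_consistency(pred, sensor_velocity=10.0, dt=1 / 30)
```

The loss vanishes when the network's estimated motion agrees with the sensor and grows quadratically with the disagreement, giving a drift-correcting supervision signal without any manual labels.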
IPFed: Identity protected federated learning for user authentication
Kaga, Yosuke, Suzuki, Yusei, Takahashi, Kenta
With the development of laws and regulations related to privacy preservation, it has become difficult to collect personal data for machine learning. In this context, federated learning, which is distributed learning without sharing personal data, has been proposed. In this paper, we focus on federated learning for user authentication. We show that it is difficult to achieve both privacy preservation and high accuracy with existing methods. To address these challenges, we propose IPFed, a privacy-preserving federated learning scheme that applies random projection to class embeddings. Furthermore, we prove that IPFed is capable of learning equivalent to the state-of-the-art method. Experiments on face image datasets show that IPFed can protect the privacy of personal data while maintaining the accuracy of the state-of-the-art method.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > Japan (0.04)
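The random-projection idea in the IPFed abstract can be sketched minimally: a client-held random matrix maps class embeddings before they leave the device, and because the projection is linear, scaled or summed updates behave consistently in the protected space. The dimensions and the `protect` helper are illustrative assumptions, not IPFed's actual protocol.

```python
import numpy as np

rng = np.random.default_rng(42)

d, k = 128, 64          # embedding dim, protected (projected) dim
P = rng.standard_normal((d, k)) / np.sqrt(k)  # client-held random projection

def protect(class_embedding):
    """Project a class embedding so the server never sees the raw vector."""
    return class_embedding @ P

w = rng.standard_normal(d)   # a class embedding for one enrolled user
w_protected = protect(w)
# linearity means aggregated updates computed on protected embeddings
# correspond to the same updates on the raw embeddings
```

Keeping `P` on the client is the privacy lever: the server only ever operates on projected vectors, while linearity preserves the algebra that federated averaging relies on.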
BreastRegNet: A Deep Learning Framework for Registration of Breast Faxitron and Histopathology Images
Golestani, Negar, Wang, Aihui, Bean, Gregory R, Rusu, Mirabela
A standard treatment protocol for breast cancer entails administering neoadjuvant therapy followed by surgical removal of the tumor and surrounding tissue. Pathologists typically rely on cabinet X-ray radiographs, known as Faxitron, to examine the excised breast tissue and diagnose the extent of residual disease. However, accurately determining the location, size, and focality of residual cancer can be challenging, and incorrect assessments can lead to clinical consequences. The utilization of automated methods can improve the histopathology process, allowing pathologists to choose regions for sampling more effectively and precisely. Despite the recognized necessity, there are currently no such methods available. Training such automated detection models requires accurate ground-truth labels on ex-vivo radiology images, which can be acquired by registering Faxitron and histopathology images and mapping the extent of cancer from histopathology to X-ray images. This study introduces a deep learning-based image registration approach trained on mono-modal synthetic image pairs. The models were trained using data from 50 women who received neoadjuvant chemotherapy and underwent surgery. The results demonstrate that our method is faster and yields significantly lower average landmark error ($2.1\pm1.96$ mm) than the state-of-the-art iterative ($4.43\pm4.1$ mm) and deep learning ($4.02\pm3.15$ mm) approaches. The improved performance of our approach in integrating radiology and pathology information facilitates generating large datasets, which allows training models for more accurate breast cancer detection.
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.55)
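Training on mono-modal synthetic pairs, as described in the BreastRegNet abstract above, amounts to warping an image by a known transform and using that transform as a free ground-truth label. The sketch below uses integer translations and brute-force registration purely for illustration; the paper's method regresses deformations with a network.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_translation_pair(img, max_shift=3):
    """Mono-modal synthetic pair: the fixed image plus a randomly shifted
    moving image, with the ground-truth shift as a free label."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    moving = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return img, moving, (int(dy), int(dx))

def register_translation(fixed_img, moving_img, max_shift=3):
    """Brute-force stand-in for a registration network: pick the shift
    that minimises mean squared error after undoing it."""
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            undone = np.roll(np.roll(moving_img, -dy, axis=0), -dx, axis=1)
            err = ((undone - fixed_img) ** 2).mean()
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

fixed_img, moving_img, gt_shift = random_translation_pair(rng.random((32, 32)))
```

Because the warp is applied synthetically, every training pair comes with exact ground truth, sidestepping the need for manually annotated landmark correspondences.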
A Deep Registration Method for Accurate Quantification of Joint Space Narrowing Progression in Rheumatoid Arthritis
Wang, Haolin, Ou, Yafei, Fang, Wanxuan, Ambalathankandy, Prasoon, Goto, Naoto, Ota, Gen, Ikebe, Masayuki, Kamishima, Tamotsu
Rheumatoid arthritis (RA) is a chronic autoimmune inflammatory disease that results in progressive articular destruction and severe disability. Joint space narrowing (JSN) progression has been regarded as an important indicator of RA progression and has received sustained attention. In the diagnosis and monitoring of RA, radiology plays a crucial role in monitoring joint space. A new framework for monitoring joint space by quantifying JSN progression through image registration in radiographic images has been developed. This framework offers the advantage of high accuracy; however, challenges remain in reducing mismatches and improving reliability. In this work, a deep intra-subject rigid registration network is proposed to automatically quantify JSN progression in the early stage of RA. In our experiments, the mean squared error of the Euclidean distance between moving and fixed images is 0.0031, the standard deviation is 0.0661 mm, and the mismatching rate is 0.48%. The proposed method has sub-pixel accuracy, far exceeding manual measurements, and is immune to noise, rotation, and scaling of joints. Moreover, this work provides loss visualization, which can aid radiologists and rheumatologists in assessing quantification reliability, with important implications for possible future clinical applications. As a result, we are optimistic that this work will make a significant contribution to the automatic quantification of JSN progression in RA.
- Health & Medicine > Therapeutic Area > Rheumatology (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
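The registration residuals used to quantify accuracy above reduce to plain Euclidean distances after applying the estimated rigid transform. A minimal sketch with hypothetical 2D joint landmarks:

```python
import numpy as np

def registration_residuals(fixed_pts, moving_pts, R, t):
    """Apply the estimated rigid transform (rotation R, translation t) to
    the moving landmarks and report residual distances to the fixed ones."""
    warped = moving_pts @ R.T + t
    d = np.linalg.norm(warped - fixed_pts, axis=1)
    return d.mean(), d.std()

# hypothetical joint landmarks related by a 5-degree rotation and a shift
theta = np.deg2rad(5.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([1.5, -0.5])
fixed_pts = np.array([[0.0, 0.0], [4.0, 1.0], [2.0, 3.0], [5.0, 5.0]])
moving_pts = (fixed_pts - t) @ R   # exact inverse, so a perfect registration
mean_d, std_d = registration_residuals(fixed_pts, moving_pts, R, t)
```

With a perfectly estimated transform the residuals vanish; in practice the mean and standard deviation of these distances are exactly the kind of statistics the abstract reports.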
No Free Lunch in Self Supervised Representation Learning
Bendidi, Ihab, Bardes, Adrien, Cohen, Ethan, Lamiable, Alexis, Bollot, Guillaume, Genovesio, Auguste
Self-supervised representation learning in computer vision relies heavily on hand-crafted image transformations to learn meaningful and invariant features. However, few extensive explorations of the impact of transformation design have been conducted in the literature. In particular, the dependence of downstream performance on transformation design has been established, but not studied in depth. In this work, we explore this relationship, examine its impact on a domain other than natural images, and show that designing the transformations can be viewed as a form of supervision. First, we demonstrate not only that transformations affect downstream performance and the relevance of clustering, but also that each category in a supervised dataset can be impacted in a different way. Following this, we explore the impact of transformation design on microscopy images, a domain where the difference between classes is more subtle and fuzzy than in natural images. In this case, we observe a greater impact on downstream task performance. Finally, we demonstrate that transformation design can be leveraged as a form of supervision, as careful selection by a domain expert can lead to a drastic increase in performance on a given downstream task.
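The link between transformation design and invariance can be made concrete: an SSL objective built on an augmentation drives the representation's "gap" under that augmentation toward zero, so choosing the augmentation set chooses the invariances. A toy sketch with a hand-made feature (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def feature(img):
    """Toy representation: per-row means. By construction this is invariant
    to horizontal flips but sensitive to vertical flips."""
    return img.mean(axis=1)

def invariance_gap(img, transform):
    """How far the representation moves under a transformation: the quantity
    an SSL objective using that augmentation would push toward zero."""
    return float(np.linalg.norm(feature(img) - feature(transform(img))))

img = rng.random((8, 8))
gap_hflip = invariance_gap(img, lambda x: x[:, ::-1])  # in the invariance set
gap_vflip = invariance_gap(img, lambda x: x[::-1, :])  # outside it
```

Which transformations end up in the zero-gap set is entirely a design decision, which is the sense in which augmentation choice acts as a form of supervision.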
3D Labeling Tool
Rachwan, John, Zalaket, Charbel
Training and testing supervised object detection models requires a large collection of images with ground-truth labels. Labels define object classes in the image, as well as their locations, shapes, and possibly other information such as pose. The labeling process has proven extremely time-consuming, even with ample manpower. We introduce a novel labeling tool for 2D images as well as 3D triangular meshes: 3D Labeling Tool (3DLT). This is a standalone, feature-rich, cross-platform software package that does not require installation and runs on Windows, macOS and Linux-based distributions. Instead of labeling the same object on every image separately as current tools do, we use depth information to reconstruct a triangular mesh from said images and label the object only once on that mesh. We use registration to simplify 3D labeling, outlier detection to improve 2D bounding box calculation, and surface reconstruction to extend labeling to large point clouds. Our tool is tested against state-of-the-art methods and greatly surpasses them in speed while preserving accuracy and ease of use.
- North America > United States > California > San Francisco County > San Francisco (0.13)
- North America > United States > California > Los Angeles County > Los Angeles (0.13)
- Europe > Austria > Vienna (0.13)
- (12 more...)
- Workflow (1.00)
- Overview (1.00)
- Research Report > Promising Solution (0.48)
- Information Technology (1.00)
- Leisure & Entertainment (0.67)
- Media > Photography (0.45)
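The outlier-detection step for 2D bounding box calculation mentioned in the 3DLT abstract above can be sketched with a simple robust rule: discard projected points far from the median (in MAD units) before taking the min/max box. This is a guess at one reasonable implementation, not the tool's actual code.

```python
import numpy as np

def bbox_with_outlier_rejection(pts, k=3.0):
    """2D bounding box over projected mesh vertices, after discarding points
    more than k (scaled) MADs from the per-axis median."""
    med = np.median(pts, axis=0)
    mad = np.median(np.abs(pts - med), axis=0) + 1e-9
    keep = np.all(np.abs(pts - med) <= k * 1.4826 * mad, axis=1)
    inliers = pts[keep]
    return inliers.min(axis=0), inliers.max(axis=0)

rng = np.random.default_rng(1)
pts = np.concatenate([rng.uniform(10.0, 20.0, (200, 2)),  # object points
                      [[500.0, 500.0]]])                  # stray depth outlier
lo, hi = bbox_with_outlier_rejection(pts)
```

Without the rejection step a single stray reconstructed point would blow the box up to 500 units; with it, the box hugs the object.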
Quantised Transforming Auto-Encoders: Achieving Equivariance to Arbitrary Transformations in Deep Networks
Jiao, Jianbo, Henriques, João F.
In this work we investigate how to achieve equivariance to input transformations in deep networks, purely from data, without being given a model of those transformations. Convolutional Neural Networks (CNNs), for example, are equivariant to image translation, a transformation that can be easily modelled (by shifting the pixels vertically or horizontally). Other transformations, such as out-of-plane rotations, do not admit a simple analytic model. We propose an auto-encoder architecture whose embedding obeys an arbitrary set of equivariance relations simultaneously, such as translation, rotation, colour changes, and many others. This means that it can take an input image, and produce versions transformed by a given amount that were not observed before (e.g. a different point of view of the same object, or a colour variation). Despite extending to many (even non-geometric) transformations, our model reduces exactly to a CNN in the special case of translation-equivariance. Equivariances are important for the interpretability and robustness of deep networks, and we demonstrate results of successful re-rendering of transformed versions of input images on several synthetic and real datasets, as well as results on object pose estimation.
- North America > United States > Oklahoma > Beaver County (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
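The claim above that the model reduces exactly to a CNN in the translation-equivariant case rests on the fact that convolution commutes with translation. A numpy sketch verifying this for a naive "valid" cross-correlation (comparing only columns unaffected by the wrap-around of `np.roll`):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive 'valid' cross-correlation, the basic CNN building block."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
    return out

rng = np.random.default_rng(3)
img = rng.random((10, 10))
kernel = rng.random((3, 3))

# translation equivariance: shifting the input shifts the feature map
a = conv2d_valid(np.roll(img, 2, axis=1), kernel)
b = np.roll(conv2d_valid(img, kernel), 2, axis=1)
# columns 0-1 are contaminated by np.roll's wrap-around; compare the rest
```

For translation the equivariance relation is analytic and exact; the auto-encoder in the paper learns analogous relations for transformations, like out-of-plane rotation, that have no such closed form.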